feat: Add client interface. #16

sitaowang1998 · 2024-10-31T17:30:44Z

Description

Add client interface.

Validation performed

task:lint

Summary by CodeRabbit

Release Notes

New Features
- Introduced a comprehensive quick start guide for the Spider distributed task executor, detailing task creation, execution, and cluster setup.
- Added a Data class for managing external data storage and a Driver class for client interactions with the Spider framework.
- Implemented a Job class for tracking running tasks and a TaskContext class for managing task execution.
- Added new concepts and type definitions for task management, enhancing the framework's functionality.
Bug Fixes
- Updated task registration macros for consistency across the codebase.
Documentation
- Enhanced documentation with new guides and concepts related to task management.
Refactor
- Improved type handling in function signatures and reorganized include directives for better clarity and efficiency.

…ception

coderabbitai · 2024-10-31T17:30:53Z

Walkthrough

The pull request introduces a quick start guide for the Spider distributed task executor, detailing its core functionalities such as task creation, execution, and cluster setup. It adds several new classes for task management, data handling, and error handling, alongside modifications to the CMake configuration and formatting files. The task registration macro is renamed for consistency, and various include directives are reorganized for improved structure. These changes collectively enhance the usability and modularity of the Spider framework.

Changes

File	Change Summary
docs/quick_start.md	Introduced a quick start guide for Spider, detailing task creation, execution, cluster setup, and recovery features.
src/spider/.clang-format	Updated regex in `IncludeCategories` to match multiple library headers.
src/spider/CMakeLists.txt	Added new libraries `spider_client_lib` and `spider_client`, with updated linking for `spider_worker`.
src/spider/client/Data.hpp	Added `Data` class for managing external data, including methods for locality and cleanup.
src/spider/client/Driver.hpp	Introduced `Driver` class for job creation and key-value store access, with methods for task management.
src/spider/client/Exception.hpp	Added custom exception handling with `ConnectionException` and `DriverIdInUseException`.
src/spider/client/Job.hpp	Added `Job` class template for managing running tasks, including status and result retrieval methods.
src/spider/client/TaskContext.hpp	Introduced `TaskContext` class for managing tasks, including methods for task management and job tracking.
src/spider/client/TaskGraph.hpp	Added `TaskGraph` class to represent a directed acyclic graph of tasks.
src/spider/client/spider.hpp	Added header guard and reorganized includes for better dependency management.
src/spider/client/task.hpp	Introduced concepts and type definitions for task management, including `TaskIo` and `TaskFunction`.
src/spider/client/type_utils.hpp	Added utility for checking type specializations with `IsSpecialization`.
src/spider/core/Data.hpp	Reordered include statements for the Boost UUID library without functional changes.
src/spider/core/Serializer.hpp	Activated UUID header and added serialization concepts for handling UUIDs.
src/spider/core/Task.hpp	Reordered include statements without functional changes.
src/spider/core/TaskGraph.hpp	Reordered include statements without functional changes.
src/spider/storage/DataStorage.hpp	Reorganized include directives for Boost UUID library without functional changes.
src/spider/storage/MetadataStorage.hpp	Reorganized include directives for Boost UUID library without functional changes.
src/spider/storage/MysqlStorage.cpp	Reorganized include statements without functional changes.
src/spider/storage/MysqlStorage.hpp	Reorganized include directives without functional changes.
src/spider/worker/FunctionManager.hpp	Renamed task registration macro and updated type handling in `signature` struct template.
tests/worker/test-FunctionManager.cpp	Updated test cases to use the new task registration macro while preserving existing test logic.

Possibly related PRs

feat: Add MySql support for storage backend #20: This PR introduces MySQL support for the storage backend, which is directly related to the main PR's focus on the Spider framework's task execution and data management, as it enhances the storage capabilities necessary for managing task states and results.
test: Add tests for data storage #21: This PR adds unit tests for data storage, which is relevant as it tests the functionalities introduced in the main PR, ensuring that the new task execution features work correctly with the data storage mechanisms.
test: Add unit tests for metadata storage and fix bugs in MySQL storage backend #23: This PR focuses on adding unit tests for metadata storage and fixing bugs in the MySQL storage backend, which is directly related to the main PR's enhancements in task management and execution, ensuring that the new features are robust and error-free.
feat: Add function manager for register and run function by function name #25: This PR introduces a FunctionManager that allows for the registration and execution of functions by name, which aligns with the main PR's goal of enhancing task execution in the Spider framework, as it provides a mechanism for managing task functions effectively.


kirkrodrigues

I'm a bit unclear on how all the APIs fit together concretely and I think someone who is seeing these headers for the first time would be a bit lost. I guess we will need to iterate on these a few times to get them to a point where they are clear (and meet the style guide). I think the first step is if you can add one or more readmes describing the layout of these APIs and how a user can use these APIs. Docstrings for each class will also help significantly.

kirkrodrigues · 2024-10-31T20:46:11Z

src/spider/client/Data.hpp

+private:
+    std::unique_ptr<DataImpl> m_impl;


Private declarations should come after public.

src/spider/client/Data.hpp

kirkrodrigues · 2024-10-31T20:49:55Z

src/spider/client/Data.hpp

+     * Gets the values stored in Data.
+     * @return value stored in Data.


Suggested change

-     * Gets the values stored in Data.
-     * @return value stored in Data.
+     * @return The stored value.

As our internal guidelines mention, for most getters, we can omit the docstring description and only describe the return value.

I think this is a special case. This is not an ordinary getter. It accesses data storage under the hood and could throw an exception, or return an error once errors are included in the interface.

Since neither the exception nor the error is currently in the code, let's write the docstring according to what exists (not what will exist later). When we eventually add the error information, we can update the docstring appropriately.

kirkrodrigues · 2024-10-31T20:50:27Z

src/spider/client/Data.hpp

+    auto get() -> T;
+    /**


Suggested change

     auto get() -> T;
+
     /**

Add an empty line between methods.

kirkrodrigues · 2024-10-31T20:53:05Z

src/spider/client/Data.hpp

+    /**
+     * Indicates that the data is persisted and should not be rollbacked
+     * on failure recovery.
+     */
+    // Not implemented in milestone 1
+    // void mark_persist();


In my experience, it's a bad idea to add code that we'll use in the future since that day may never come. If that day comes, we will remember what to do based on the design doc rather than this commented code.

kirkrodrigues · 2024-10-31T21:26:45Z

src/spider/client/Data.hpp

+         * Sets the key for the data. If no key is provided, Spider generates a key.
+         * @param key of the data
+         */
+        auto key(std::string const& key) -> Data<T>::Builder&;


Return value is missing from docstring.

kirkrodrigues · 2024-10-31T21:27:57Z

src/spider/client/Data.hpp

+    /**
+     * Sets locality list of the data.
+     * @param nodes nodes that has locality
+     * @param hard true if the locality list is a hard requirement, false otherwise


These docstrings are a bit unclear to me. I guess the idea is that if hard=false, the data can be accessed from any node in the cluster, but if possible, it should be accessed from the nodes specified in this list?

Yes. Hard requirement means data can only be accessed on nodes in the locality list. Will change the docstring to make it clear.
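To make the hard/soft distinction concrete, here is a minimal sketch of how a scheduler might interpret the flag. `Locality` and `can_run_on` are hypothetical names used only for illustration; they are not part of Spider's interface, and the real scheduling logic may differ.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical illustration only: not Spider's actual scheduler logic.
struct Locality {
    std::vector<std::string> nodes;  // Nodes that hold the data
    bool hard;                       // true: the data is only accessible on `nodes`
};

// Returns whether a task that reads the data may be placed on `node`.
auto can_run_on(Locality const& locality, std::string const& node) -> bool {
    bool const is_local
            = std::find(locality.nodes.begin(), locality.nodes.end(), node)
              != locality.nodes.end();
    // Hard locality is a requirement; soft locality is only a preference.
    return is_local || !locality.hard;
}
```

With soft locality, every node is allowed and the list would only influence which node is preferred; with hard locality, nodes outside the list are rejected.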

kirkrodrigues · 2024-10-31T21:29:19Z

src/spider/client/Data.hpp

+         * Sets the key for the data. If no key is provided, Spider generates a key.
+         * @param key of the data
+         */
+        auto key(std::string const& key) -> Data<T>::Builder&;


I think it would be clearer to name all these setters as set_xxx where xxx is the thing being set.

kirkrodrigues · 2024-10-31T21:29:54Z

src/spider/client/Data.hpp

+        // Not implemented for milestone 1
+        // auto rollback(std::function<const T&()> const& f) -> Data<T>::Builder&;
+        /**
+         * Builds the data. Stores the value of data into storage with locality list, persisted


How does this class store the data?

Internally all values are serialized and stored in data storage.

kirkrodrigues · 2024-10-31T21:38:08Z

src/spider/client/Data.cpp

+
+class DataImpl {};
+
+template <class T>


You should document template parameters as well.

…nciplew

kirkrodrigues

Reviewed quickstart.

kirkrodrigues · 2024-11-02T15:28:33Z

docs/QuickStart.md

Our naming convention is to use kebab-case for markdown files except for the main readme in a repo which is called README.md. So this file should be called quick-start.md. I've added this to our internal guidelines.

kirkrodrigues · 2024-11-02T15:30:35Z

docs/QuickStart.md

@@ -0,0 +1,204 @@
+# Spider Quick Start Guide


Our convention is to use sentence case for headings rather than capitalizing every word. So this should be written as # Spider quick start guide.

I've changed all titles, but I treat Spider as a special term and keep it capitalized.

kirkrodrigues · 2024-11-02T15:31:27Z

docs/QuickStart.md

+# Spider Quick Start Guide
+
+## Set Up Spider
+To get started, first start a database supported by Spider, e.g. MySql. Second, start a scheduler and connect it to the database by running `spider start --scheduler --db <db_url> --port <scheduler_port>`. Third, start some workers and connect them to the database by running `spider start --worker --db <db_url>`.


Try to use lists rather than a block of text. Lists are faster to read than blocks of text.

kirkrodrigues · 2024-11-02T15:33:08Z

docs/QuickStart.md

I will add a linter to do this later, but treat Markdown as code and wrap lines to 100 characters. This makes it easier to review and read (not everyone has a good Markdown renderer on their local machine).

kirkrodrigues · 2024-11-02T15:33:53Z

docs/QuickStart.md

+# Spider Quick Start Guide
+
+## Set Up Spider
+To get started, first start a database supported by Spider, e.g. MySql. Second, start a scheduler and connect it to the database by running `spider start --scheduler --db <db_url> --port <scheduler_port>`. Third, start some workers and connect them to the database by running `spider start --worker --db <db_url>`.


You have a caveat for starting the worker below. You should note that here otherwise someone will try to start the worker, it won't work, and they will give up.

kirkrodrigues · 2024-11-02T16:06:01Z

docs/QuickStart.md

+}
+```
+
+However, it is impossible to get the return value of the task graph from a client. We have a solution by sharing data using key-value store, which will be discussed later. Another solution is to run task or task graph inside a task and wait for its value, just like a client. This solution is closer to the conventional function call semantic.


However, it is impossible to get the return value of the task graph from a client.

This sentence doesn't make sense since the previous section has an example where the client is retrieving the result of a task graph.

Reading the later parts of this guide, I guess you mean that you can't return the value of a dynamically created task to the client (which is intuitive since the interface doesn't provide a mechanism to do so).

kirkrodrigues · 2024-11-02T16:09:56Z

docs/QuickStart.md

+
+Spider lets user pass the metadata of these data around in `spider::Data` objects. `Data` stores the value of the metadata information of external data, and provides crucial information to Spider for correct and efficient scheduling and failure recovery. `Data` stores a list of nodes which has locality of the external data, and user can specify if locality is a hard requirement, i.e. task can only run on the nodes in locality list. `Data` can include a `cleanup`function, which will run when the `Data` object is no longer reference by any task and client. `Data` has a persist flag to represent that external data is persisted and do not need to be cleaned up.
+
+```c++


This example is complicated and should definitely have a description. To not have a description means that we're slowing down the user since they first need to infer what the goal of the example's code is, then they can focus on the details of using the framework.

This example is also a good reason why our internal guidelines say that we should organize things from public to private. If you apply that thinking here, we would put main first and then the task methods. That is the order that a user should read the methods (the opposite order requires the user to remember a lot of context before they get to main).

kirkrodrigues · 2024-11-02T16:12:27Z

docs/QuickStart.md

+        .build(HdfsFile { "/path/to/input" });
+    spider::Future<spider::Data<HdfsFile>> future = spider::run(
+        spider::bind(map, filter),
+        input);


input is not a POD, so that violates the constraint mentioned above about task arguments being POD.

kirkrodrigues · 2024-11-02T16:16:01Z

docs/QuickStart.md

+To get started, first start a database supported by Spider, e.g. MySql. Second, start a scheduler and connect it to the database by running `spider start --scheduler --db <db_url> --port <scheduler_port>`. Third, start some workers and connect them to the database by running `spider start --worker --db <db_url>`.
+
+## Start a Client
+Client first creates a Spider client driver and connects it to the database. Spider automatically cleans up the resource in driver's destructor, but you can close the driver to release the resource early.


We should have an intro section where we describe the actors and architecture in the system (client, scheduler, worker, database, etc.). It doesn't need to be too detailed (that can be in another doc), but it should provide enough context for the user to visualize what we're talking about.

kirkrodrigues · 2024-11-02T16:19:05Z

docs/QuickStart.md

+## Note on Worker Setup
+The setup section said that we can start a worker by running `spider start --worker --db <db_url>`. This is oversimplified. The worker has to know the function it will run.
+
+When user compiles the client code, an executable and a library are generated. The executable executes the client code as expected. The library contains all the functions registered by user. Worker needs to run with a copy of this library. The actual commands to start a worker is `spider start --worker --db <db_url> --libs [client_libraries]`.


This makes sense, so in the guide above, wouldn't it be easier to just say that we need to create a task library with all the tasks the user wants to run? Then we can reference those tasks in the client code. Right now, the guide says we need to register the tasks by calling spider::register_task which sounds pretty opaque to me.

The user is responsible for creating a task library and the client executable. Actually, a good practice is for the user to put spider::register_task with the tasks' declarations instead of in main, so maybe I should move task registration into the "Create a task" section?

…own at 100 chars

kirkrodrigues · 2024-11-03T10:57:48Z

docs/quick_start.md

+`spider::Data` that is directly used as input. Spider requires that the types of `Task` or
+`TaskGraph` outputs or POD type or `spider::Data` matches the input types of child task.
+
+Binding the tasks together forms a dependencies among tasks, which is represented by


How do you plan to communicate (serialize) the task calls (name + args) from the client to the server?

Let's say a task takes two inputs. Does this interface support using a constant for one input and a task for the other?

If so, do you support a task graph that looks like this?

flowchart TD
    leaf["foo(int, int) -> int"]
    parent["bar(int, int) -> int"]
    3 --> leaf
    4 --> parent
    5 --> parent
    parent --> leaf

If so, that means any arguments the user passes into run get passed to the inputs in the task graph with a kind of DFS ordering. Is that true?

Here comes the use of spider::register_task, which records the mapping between function name and the function pointer. When the user calls run with the function pointer, the library actually sends the function name to the db. Worker also gets the name of the function as part of the task metadata. Since the worker links to the library with register_task call, it also has the mapping and can know which function to call.
As for arguments, their types are stored in db (serialized as string right now), and values are serialized into string.

Yes.
i. Yes.
ii. No. spider::bind needs to bind all inputs of the child task. Thus, the value 3 must be an argument in bind. 4 and 5 can be passed in run. However, it is possible that there are multiple first layer tasks, i.e. tasks with no parents. The input of task graph is the product of all first layer tasks' output.
It is possible to support passing 3 as an argument in run for the above example. We can have a special placeholder type for bind. In such case, DFS is an intuitive order.

Can you explain how:

register_task maps a function pointer into a function name?

run turns a function pointer into a function name?

Can you explain this point with an example?

However, it is possible that there are multiple first layer tasks, i.e. tasks with no parents. The input of task graph is the product of all first layer tasks' output.

For client and worker, when the library is loaded, it will create a mapping between function name and function pointer.
i. I am thinking of turning register_task into a macro to get the function name at compile time. It then stores the mapping between the function name to the function pointer.
ii. run gets the function name from the mapping stored before.
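A minimal sketch of that idea, assuming every task has the signature `int(int, int)` to keep the registry simple. `TaskFn`, `task_registry`, `TaskRegistrar`, `SKETCH_REGISTER_TASK`, and `name_of` are hypothetical names for illustration, not Spider's actual macro or API.

```cpp
#include <map>
#include <string>

// Hypothetical sketch: all tasks share one signature to keep the registry simple.
using TaskFn = int (*)(int, int);

// A function-local static avoids static initialization order problems.
inline auto task_registry() -> std::map<std::string, TaskFn>& {
    static std::map<std::string, TaskFn> registry;
    return registry;
}

// Registers a name -> pointer mapping as a side effect of construction.
struct TaskRegistrar {
    TaskRegistrar(char const* name, TaskFn fn) { task_registry()[name] = fn; }
};

// Stringizes the function's name at compile time; the registration itself runs
// when the library (or executable) containing this line is loaded.
#define SKETCH_REGISTER_TASK(func) \
    static TaskRegistrar const registrar_##func{#func, &func}

// What `run` would do on the client: map the pointer back to its registered
// name. Returns an empty string if the function was never registered.
inline auto name_of(TaskFn fn) -> std::string {
    for (auto const& [name, registered] : task_registry()) {
        if (registered == fn) {
            return name;
        }
    }
    return {};
}

inline auto sum(int a, int b) -> int { return a + b; }
SKETCH_REGISTER_TASK(sum);
```

In this sketch the client would send `name_of(&sum)` to the database, and a worker that links the same library would look the name up in `task_registry()` to find the function to call.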

In the example below, both bar and baz have no parent, and the task graph input is their inputs put together, i.e. 1, 2, 3 and 4.

graph TD
    1[1]
    2[2]
    3[3]
    4[4]
    foo["foo(int, int) -> int"]
    bar["bar(int, int) -> int"]
    baz["baz(int, int) -> int"]
    1 --> bar
    2 --> bar
    3 --> baz
    4 --> baz
    bar --> foo
    baz --> foo

Makes sense.

Gotcha. So to clarify my understanding:

in run, users can only pass arguments to root tasks (tasks with no parents).

in run, when the user passes a list of arguments [1, 2, 3, 4], conceptually:

the framework will iterate over the root tasks in order;

for each task argument, the framework will pop one argument from the list.

Yes and yes.
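The two points above can be sketched as follows. This is only a conceptual model under simplifying assumptions: arguments are plain `int`s, and `RootTask` and `assign_args` are hypothetical names, not part of the Spider interface.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a root task: only its name and argument count matter here.
struct RootTask {
    std::string name;
    std::size_t num_args;
};

// Iterates over the root tasks in order and pops one value from the flat
// argument list for each task argument.
auto assign_args(std::vector<RootTask> const& roots, std::vector<int> const& args)
        -> std::vector<std::vector<int>> {
    std::vector<std::vector<int>> assigned;
    std::size_t next = 0;
    for (auto const& task : roots) {
        if (args.size() - next < task.num_args) {
            throw std::invalid_argument("Too few arguments for " + task.name);
        }
        auto const first = args.begin() + static_cast<std::ptrdiff_t>(next);
        assigned.emplace_back(first, first + static_cast<std::ptrdiff_t>(task.num_args));
        next += task.num_args;
    }
    if (next != args.size()) {
        throw std::invalid_argument("Too many arguments");
    }
    return assigned;
}
```

For the graph above, root tasks `bar` and `baz` with two arguments each and the list `[1, 2, 3, 4]` would give `bar` the values 1 and 2, and `baz` the values 3 and 4.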

…section

… in client id spider::run now returns a Job that represents a running task or task graph, so the user can cancel the job. The user can now pass in a client id when creating the driver, so a recovered client can get all of its jobs by client id.

kirkrodrigues · 2024-11-16T11:15:58Z

Can you make the changes to the header files that we discussed on Thursday?

kirkrodrigues

Some additional comments.

kirkrodrigues · 2024-11-19T13:01:02Z

src/spider/client/Spider.hpp

Since this file doesn't correspond to a class, the filename should be lowercase.

kirkrodrigues · 2024-11-19T13:01:38Z

src/spider/client/Spider.hpp

+#include <optional>
+#include <string>
+
+// NOLINTBEGIN(misc-include-cleaner)


We should be using IWYU's pragmas, right?

kirkrodrigues · 2024-11-19T13:02:56Z

src/spider/client/Spider.hpp

+/**
+ * Initializes Spider library
+ */
+void init();


What will this function do internally?

kirkrodrigues · 2024-11-19T13:07:00Z

src/spider/client/Spider.hpp

+ * Registers function to Spider
+ * @param function function to register
+ */
+template <class R, class... Args>


Let's replace R and Args with C++20 concepts that specify the exact properties of acceptable inputs. If I understand correctly, currently, return values and arguments can be a certain set of serializable values.

On that note though, if the requirement is that the value types need to be serializable, why don't we define an interface for serializable value types and then the user has the flexibility to use any serializable type they want? We could do this later, but conceptually, is it possible?

Let's also use the same C++20 concepts wherever we have task inputs/outputs. This would also simplify the docstrings since we don't need to repeat what types are acceptable.

kirkrodrigues · 2024-11-19T13:11:09Z

src/spider/client/Spider.hpp

+ * @return job representing the running task
+ */
+template <class R, class... Args>
+auto run(std::function<R(Args...)> const& task, Args&&... args) -> Job<R>;


How about renaming the run functions to start? run implies that the function won't return until the task/taskgraph completes.

kirkrodrigues · 2024-11-19T13:14:35Z

src/spider/client/Spider.hpp

+
+// NOLINTEND(misc-include-cleaner)
+
+namespace spider {


Reading the functions in this file and those in KeyValueData, I feel like it makes more sense to have a Client/Driver class that encapsulates the functions (similar to Context). Otherwise, a user could call, for example, insert_kv without initializing a client and then it would fail or invoke undefined behaviour, right?
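A minimal sketch of the encapsulation being suggested, assuming an in-memory map in place of the real storage connection. `SketchDriver` and its members are hypothetical and only illustrate why an object makes the "not initialized" state unrepresentable.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch: an in-memory map stands in for the storage connection.
class SketchDriver {
public:
    // Construction is the only way to obtain a driver, so every method runs
    // against an initialized client.
    explicit SketchDriver(std::string storage_url) : m_storage_url{std::move(storage_url)} {}

    auto insert_kv(std::string const& key, std::string const& value) -> void {
        m_kv[key] = value;
    }

    // Returns the stored value, or std::nullopt if the key is absent.
    [[nodiscard]] auto get_kv(std::string const& key) const -> std::optional<std::string> {
        auto const it = m_kv.find(key);
        if (m_kv.end() == it) {
            return std::nullopt;
        }
        return it->second;
    }

    [[nodiscard]] auto storage_url() const -> std::string const& { return m_storage_url; }

private:
    std::string m_storage_url;
    std::map<std::string, std::string> m_kv;
};
```

Because `insert_kv` is a member function, it cannot be called before a driver exists, which removes the failure mode of calling a free function before `init()`.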

kirkrodrigues · 2024-11-19T13:17:55Z

src/spider/client/Job.hpp

@@ -0,0 +1,61 @@
+#ifndef SPIDER_CLIENT_FUTURE_HPP


Double-check all the header guards.

kirkrodrigues · 2024-11-19T13:19:05Z

src/spider/client/Data.hpp

+ *
+ * @tparam T type of the value. T must be a POD.
+ */
+template <class T>


Like I mentioned in Spider.hpp, we can restrict the types using C++20 concepts.

kirkrodrigues · 2024-11-19T13:23:58Z

CMakeLists.txt

+# AppleClang complains about file has no symbol and abort the build.
+if(APPLE)
+    set(CMAKE_CXX_ARCHIVE_CREATE "<CMAKE_AR> Scr <TARGET> <LINK_FLAGS> <OBJECTS>")
+    set(CMAKE_CXX_ARCHIVE_FINISH "<CMAKE_RANLIB> -no_warning_for_no_symbols -c <TARGET>")
+endif()


Do we need this anymore considering we've dropped support for building on macOS?

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)

src/spider/client/Data.hpp (1)
46-74: Add validation and improve Builder documentation.

The Builder implementation could benefit from:

Documentation about method chaining

Validation for empty node vectors and null cleanup functions

Consider adding validation in set_locality:
auto set_locality(std::vector<std::string> const& nodes, bool hard) -> Builder& {
    if (nodes.empty()) {
        throw std::invalid_argument("Locality nodes vector cannot be empty");
    }
    // ... existing implementation
}

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 165eb84 and 488eaa3.

📒 Files selected for processing (2)

src/spider/client/Data.hpp (1 hunks)
src/spider/client/TaskContext.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/spider/client/TaskContext.hpp

🔇 Additional comments (5)

src/spider/client/Data.hpp (5)

1-10: LGTM! Header organization follows best practices.

The includes are well-organized and appropriate for the required functionality.

76-78: LGTM! Good use of pImpl idiom.

The use of pImpl pattern effectively hides implementation details and helps maintain ABI compatibility.

31-45: 🛠️ Refactor suggestion

Improve method documentation and ordering.

The get() method should document potential errors or exceptions.
The set_locality() parameter documentation should explain what nodes represents.
Per Google C++ style guide, private declarations should come after public ones.

Consider adding error handling documentation:

/**
 * Retrieves the stored value.
 * @return The stored value.
 * @throws std::runtime_error If the data cannot be accessed or deserialized.
 */
auto get() -> T;

29-29: Verify Serializable concept implementation.

Let's ensure the Serializable concept is properly defined and documented.

✅ Verification successful

Serializable concept is properly implemented and used

The Serializable concept is well-defined in src/spider/core/Serializer.hpp as a composition of two concepts:

SerializableImpl: Requires types to support msgpack serialization
DeSerializableImpl: Requires types to support msgpack deserialization

The concept is appropriately used in the codebase:

As a template constraint for the Data class
As part of the TaskIo concept definition

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for Serializable concept definition
rg -A 5 "concept\s+Serializable"

# Search for any serialization-related functions that types must implement
ast-grep --pattern 'template <$_>
concept Serializable = $_'

Length of output: 592

Script:

#!/bin/bash
# Let's also check for DeSerializableImpl concept definition
rg -A 5 "concept\s+DeSerializableImpl"

# And check where this Serializable concept is being used
rg "Serializable\s*[<\s]" -A 2

Length of output: 930

14-29: ⚠️ Potential issue

Fix inconsistencies in example code.

The example code has several issues:

set_locality example uses a single string, but the method expects a vector
There's a typo in cont& instead of const&
The cleanup function signature doesn't match the method declaration

Apply this diff to fix the example:

-         .set_locality({ "node_address" }, true)
-         .set_cleanup_func([](std::string cont& path) { std::remove(path); })
+         .set_locality(std::vector<std::string>{"node_address"}, true)
+         .set_cleanup_func([](std::string const& path) { std::remove(path.c_str()); })

Likely invalid or redundant comment.

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)

src/spider/client/Data.hpp (3)
29-30: Consider adding error handling mechanism.

The Serializable concept constraint is good, but the class should provide a way to handle potential serialization errors.

Consider adding error handling through either:

Exception handling

std::expected (C++23) or tl::expected

Error codes

37-44: Enhance documentation for locality parameters.

The parameter documentation could be more descriptive:

What constitutes a valid node address?

What happens if an invalid node is specified?

What are the implications of hard vs soft locality?

46-74: Consider adding noexcept specifications and move operations.

The Builder implementation could be enhanced:

Mark methods as noexcept where applicable

Add move operations for better performance

Consider making cleanup function noexcept

Example improvement:
auto set_cleanup_func(std::function<void(T const&) noexcept> f) noexcept -> Builder&;

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 488eaa3 and 4d2aa6c.

📒 Files selected for processing (6)

src/spider/client/Data.hpp (1 hunks)
src/spider/client/Driver.hpp (1 hunks)
src/spider/client/Job.hpp (1 hunks)
src/spider/client/TaskContext.hpp (1 hunks)
src/spider/client/task.hpp (1 hunks)
src/spider/client/type_utils.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

src/spider/client/Driver.hpp
src/spider/client/Job.hpp
src/spider/client/TaskContext.hpp
src/spider/client/task.hpp
src/spider/client/type_utils.hpp

🔇 Additional comments (2)

src/spider/client/Data.hpp (2)

1-10: LGTM! Well-organized includes and proper header guards.

The includes are properly organized with standard libraries first, followed by local includes, and all are necessary for the implementation.

20-25: ⚠️ Potential issue

Fix inconsistencies in the example code.

The example code doesn't match the actual method signatures:

The cleanup function lambda has incorrect signature
The locality parameter is passed as a single string instead of a vector

Apply this diff to fix the example:

 auto disk_file_data = spider::Data<std::string>::Builder()
-         .set_locality({"node_address"}, true)
-         .set_cleanup_func([](std::string cont& path) { std::remove(path); })
+         .set_locality(std::vector<std::string>{"node_address"}, true)
+         .set_cleanup_func([](std::string const& path) { std::remove(path.c_str()); })
          .build("/path/of/file");

Likely invalid or redundant comment.

coderabbitai · 2024-11-29T16:51:56Z

src/spider/client/Data.hpp

+private:
+    std::unique_ptr<DataImpl> m_impl;
+};


⚠️ Potential issue

Add destructor declaration for pImpl idiom.

When using the pImpl idiom with std::unique_ptr, you need to declare the destructor in the header and define it in the implementation file where the complete type is available.

Add the following declaration:

~Data(); // Declaration only, define in .cpp file
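
To illustrate why the out-of-line destructor matters, here is a minimal single-file sketch of the idiom for a non-template class. `Widget` and `WidgetImpl` are hypothetical names, not part of Spider. Note that `Data<T>` is a class template, so its member definitions would normally have to be visible in a header anyway; the split shown here only applies if the implementation sits behind a non-template boundary.

```cpp
#include <memory>
#include <string>
#include <utility>

// --- What would live in the header (e.g. Widget.hpp) ---
class WidgetImpl;  // Forward declaration only; the type is incomplete here.

class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();  // Declared here, defined below where WidgetImpl is complete.

    [[nodiscard]] auto name() const -> std::string const&;

private:
    std::unique_ptr<WidgetImpl> m_impl;
};

// --- What would live in the implementation file (e.g. Widget.cpp) ---
class WidgetImpl {
public:
    explicit WidgetImpl(std::string name) : m_name(std::move(name)) {}

    std::string m_name;
};

Widget::Widget(std::string name) : m_impl(std::make_unique<WidgetImpl>(std::move(name))) {}

// std::unique_ptr's deleter needs the complete WidgetImpl type, which is
// available at this point but not in the header.
Widget::~Widget() = default;

auto Widget::name() const -> std::string const& {
    return m_impl->m_name;
}
```

If the destructor were left implicit, it would be generated wherever the header is included, and the compiler would reject deleting the incomplete `WidgetImpl`.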

coderabbitai · 2024-11-29T16:51:56Z

src/spider/client/Data.hpp

+#ifndef SPIDER_CLIENT_DATA_HPP
+#define SPIDER_CLIENT_DATA_HPP
+
+#include <functional>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "../core/Serializer.hpp"
+
+namespace spider {
+class DataImpl;
+
+/**
+ * A representation of data stored on external storage. This class allows the user to define:
+ * - how the data should be cleaned up (garbage collected) once it is no longer referenced.
+ * - the locality of the data.
+ *
+ * Example:
+ * @code{.cpp}
+ * auto disk_file_data = spider::Data<std::string>::Builder()
+ *         .set_locality({"node_address"}, true)
+ *         .set_cleanup_func([](std::string cont& path) { std::remove(path); })
+ *         .build("/path/of/file");
+ * @endcode
+ *
+ * @tparam T Type of the value.
+ */
+template <Serializable T>
+class Data {
+public:
+    /**
+     * @return The stored value.
+     */
+    auto get() -> T;
+
+    /**
+     * Sets the data's locality, indicated by the nodes that contain the data.
+     *
+     * @param nodes
+     * @param hard Whether the data is only accessible from the given nodes (i.e., the locality is a
+     * hard requirement).
+     */
+    void set_locality(std::vector<std::string> const& nodes, bool hard);
+
+    class Builder {
+    public:
+        /**
+         * Sets the data's locality, indicated by the nodes that contain the data.
+         *
+         * @param nodes
+         * @param hard Whether the data is only accessible from the given nodes (i.e., the locality
+         * is a hard requirement.
+         * @return self
+         */
+        auto set_locality(std::vector<std::string> const& nodes, bool hard) -> Builder&;
+
+        /**
+         * Sets the cleanup function for the data. This function will be called when the data is no
+         * longer referenced.
+         *
+         * @param f
+         * @return self
+         */
+        auto set_cleanup_func(std::function<void(T const&)> const& f) -> Builder&;
+
+        /**
+         * Builds the data object.
+         *
+         * @param t Value of the data
+         * @return The built object.
+         */
+        auto build(T const& t) -> Data;
+    };
+
+private:
+    std::unique_ptr<DataImpl> m_impl;
+};
+}  // namespace spider
+
+#endif  // SPIDER_CLIENT_DATA_HPP


💡 Codebase verification

Template implementation file missing

The code appears to be a header-only template library where:

Data.hpp in the client directory defines the template interface

Data.hpp in the core directory contains a different concrete implementation

The template implementation is missing, and DataImpl is only forward-declared

Required actions:

Implement the template methods for Data<T> either in the header file or in a separate .ipp file

Provide the complete implementation of the DataImpl class
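
A minimal sketch of the first action, with the member definitions written out of line but still visible to every includer. `Box` is a hypothetical stand-in for `Data<T>`, not the Spider class, and the sketch assumes nothing about how Spider stores values.

```cpp
#include <string>
#include <utility>

// Declaration, as it would appear at the top of the header.
template <typename T>
class Box {
public:
    explicit Box(T value);

    // Returns a copy of the stored value.
    [[nodiscard]] auto get() const -> T;

private:
    T m_value;
};

// Definitions. For templates these must be visible to every translation unit
// that instantiates Box<T>, so they stay in the header or in an .ipp file that
// the header includes after the class definition.
template <typename T>
Box<T>::Box(T value) : m_value(std::move(value)) {}

template <typename T>
auto Box<T>::get() const -> T {
    return m_value;
}
```

Putting the definitions in a `.cpp` file instead would compile, but any other translation unit that instantiates `Box<T>` would fail at link time unless explicit instantiations were also provided.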

🔗 Analysis chain

Verify implementation file exists.

Ensure that all declared methods are implemented in the corresponding .cpp file.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for the implementation file
fd "Data\.(cpp|cc|cxx)$" src/spider/client/

# Check if all methods are implemented
rg "Data<.*>::\w+" src/spider/client/

Length of output: 174

Script:

#!/bin/bash
# Let's check if there's a template implementation in the header file itself
rg -U "template.*class Data.*\{.*\}" -A 100 src/spider/client/Data.hpp

# Also check if there's any inline implementation
rg "inline.*Data<.*>::\w+" src/spider/client/

# Check for potential implementation in other directories
fd "Data(Impl)?\.(cpp|cc|cxx|hpp|h)$" src/spider/

Length of output: 224

Script:

#!/bin/bash
# Let's examine the core/Data.hpp to understand if it contains the implementation
cat src/spider/core/Data.hpp

# Also check for DataImpl definition
rg "class DataImpl" src/spider/

Length of output: 2017

kirkrodrigues

All of the Driver and TaskContext methods should throw an exception if they can't connect to the storage component, right? If so, can we document that with @throw?
Do we need any of the forward declared XxxImpl classes in this PR?
I'm still working on the quick-start guide, so in the interest of time, can we remove it from this pull request and then I'll submit another pull request to add it?

kirkrodrigues · 2024-11-29T16:49:45Z

src/spider/client/Driver.hpp

+     * fails and then needs to reconnect and retrieve all previously created jobs. NOTE: It is
+     * undefined behaviour for two clients to concurrently use the same ID.
+     */
+    Driver(std::string const& storage_url, boost::uuids::uuid id);


This method should throw an exception if the client ID is already in use, right?

kirkrodrigues · 2024-11-29T16:52:04Z

src/spider/client/TaskContext.hpp

+#include "TaskGraph.hpp"
+
+namespace spider {
+class ContextImpl;


Do we need this in this PR?

sitaowang1998 · 2024-11-29T17:01:44Z

We can add throw, and remove it later if we have the error code ready. Is there other exception that should be thrown other than connection error?
I'll remove all pImpl forward declaration.
I'll remove the quick start guide.

kirkrodrigues · 2024-11-29T17:03:43Z

As far as I can tell, just the connection error one and the client ID already being in use.
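
A rough sketch of how those two failure modes could be documented with `@throw` and enforced in a constructor. Everything here is hypothetical: `SketchDriver`, `ConnectionError`, and `IdInUseError` are illustrative names, the ID is an `int` rather than a `boost::uuids::uuid`, and an empty URL merely stands in for an unreachable storage backend.

```cpp
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical exception types standing in for Spider's own.
class ConnectionError : public std::runtime_error {
public:
    explicit ConnectionError(std::string const& url)
            : std::runtime_error("Cannot connect to storage " + url + ".") {}
};

class IdInUseError : public std::runtime_error {
public:
    explicit IdInUseError(int id)
            : std::runtime_error("Driver ID " + std::to_string(id) + " is currently in use.") {}
};

class SketchDriver {
public:
    /**
     * @param storage_url
     * @param id A caller-chosen client ID.
     * @throw ConnectionError if the storage backend cannot be reached. In this sketch, an empty
     * URL stands in for an unreachable backend.
     * @throw IdInUseError if another live driver already holds `id`.
     */
    SketchDriver(std::string const& storage_url, int id) : m_id(id) {
        if (storage_url.empty()) {
            throw ConnectionError{storage_url};
        }
        // insert() reports through `.second` whether the ID was newly added.
        if (false == ids_in_use().insert(id).second) {
            throw IdInUseError{id};
        }
    }

    // Releases the ID. Not run if the constructor threw, so a rejected driver
    // never removes an ID that belongs to another instance.
    ~SketchDriver() { ids_in_use().erase(m_id); }

    SketchDriver(SketchDriver const&) = delete;
    auto operator=(SketchDriver const&) -> SketchDriver& = delete;

private:
    // Process-local registry; a real system would consult the storage backend.
    static auto ids_in_use() -> std::set<int>& {
        static std::set<int> ids;
        return ids;
    }

    int m_id;
};
```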

coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)

src/spider/client/Data.hpp (2)
37-44: Enhance locality documentation

The documentation should clarify:

The implications of hard vs soft locality

The behaviour when accessing data from non-specified nodes

Whether locality can be changed after data is created

46-74: Enhance Builder implementation

Consider the following improvements:

Add move semantics support

Add parameter validation

Make Builder moveable but non-copyable
class Builder {
public:
    Builder() = default;
    Builder(Builder&&) noexcept = default;
    Builder& operator=(Builder&&) noexcept = default;
    Builder(const Builder&) = delete;
    Builder& operator=(const Builder&) = delete;
    // ... existing methods ...
private:
    void validate_nodes(std::vector<std::string> const& nodes) {
        if (nodes.empty()) {
            throw std::invalid_argument("Nodes vector cannot be empty");
        }
    }
};

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4d2aa6c and 88ed638.

📒 Files selected for processing (1)

src/spider/client/Data.hpp (1 hunks)

🔇 Additional comments (3)

src/spider/client/Data.hpp (3)

20-25: Update example code to match method signatures.

The example code doesn't match the actual method signatures:

set_locality expects a vector of strings, not a single string
set_cleanup_func signature is void(T const&), which matches the example but differs from what was previously discussed

- auto disk_file_data = spider::Data<std::string>::Builder()
-         .set_locality({"node_address"}, true)
-         .set_cleanup_func([](std::string const& path) { std::filesystem::remove(path); })
-         .build("/path/of/file");
+ auto disk_file_data = spider::Data<std::string>::Builder()
+         .set_locality(std::vector<std::string>{"node_address"}, true)
+         .set_cleanup_func([](std::string const& path) { std::filesystem::remove(path); })
+         .build("/path/of/file");

76-78: Add destructor declaration for pImpl idiom

When using the pImpl idiom with std::unique_ptr, you need to declare the destructor in the header.

+public:
+    ~Data();
+
 private:
     std::unique_ptr<DataImpl> m_impl;

29-30: Template implementation file missing

The template implementation is missing. Consider:

Adding implementation in a separate .ipp file
Including the .ipp file at the end of this header

coderabbitai · 2024-11-29T17:03:44Z

src/spider/client/Data.hpp

+    /**
+     * @return The stored value.
+     */
+    auto get() -> T;


🛠️ Refactor suggestion

Consider adding error handling to get()

The get() method might fail when accessing the underlying storage. Consider:

Adding error handling using std::expected (C++23) or Result type

Documenting potential failure scenarios
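
As a sketch of the "Result type" option for toolchains without C++23, the following uses `std::variant` as a minimal stand-in for `std::expected`. `Result`, `StorageError`, and `get_value` are all hypothetical names introduced for illustration; none of them are part of Spider.

```cpp
#include <string>
#include <utility>
#include <variant>

// Hypothetical error category; not part of Spider.
enum class StorageError {
    Unreachable,
    NotFound,
};

// A minimal stand-in for std::expected<T, StorageError>.
template <typename T>
class Result {
public:
    // Intentionally implicit so that return statements stay terse.
    Result(T value) : m_state(std::move(value)) {}
    Result(StorageError error) : m_state(error) {}

    [[nodiscard]] auto has_value() const -> bool { return std::holds_alternative<T>(m_state); }

    // Precondition: has_value() is true; otherwise std::get throws std::bad_variant_access.
    [[nodiscard]] auto value() const -> T const& { return std::get<T>(m_state); }

    // Precondition: has_value() is false.
    [[nodiscard]] auto error() const -> StorageError { return std::get<StorageError>(m_state); }

private:
    std::variant<T, StorageError> m_state;
};

// Hypothetical lookup that reports failure through the return value instead of throwing.
inline auto get_value(bool storage_reachable, std::string const& key) -> Result<std::string> {
    if (false == storage_reachable) {
        return StorageError::Unreachable;
    }
    if (key.empty()) {
        return StorageError::NotFound;
    }
    return "value-for-" + key;
}
```

The caller is then forced to look at `has_value()` before reading the value, which makes the failure scenarios part of the signature rather than something only the documentation mentions.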

coderabbitai

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)

src/spider/client/Exception.hpp (1)
11-30: Consider implementing a base exception class.

To improve maintainability and reduce code duplication, consider implementing a base Spider exception class that handles common functionality like message formatting and storage. This would also make it easier to add new exception types in the future.

Example structure:
#include <exception>
#include <string>
#include <utility>

#include <fmt/format.h>

class SpiderException : public std::exception {
public:
    [[nodiscard]] const char* what() const noexcept override {
        return m_message.c_str();
    }

protected:
    explicit SpiderException(std::string message) : m_message(std::move(message)) {}

private:
    // Set once in the constructor, so it does not need to be mutable.
    std::string m_message;
};

class ConnectionException final : public SpiderException {
public:
    explicit ConnectionException(std::string const& addr)
        : SpiderException(fmt::format("Cannot connect to storage {}.", addr)) {}
};
src/spider/client/Data.hpp (1)
37-44: Improve parameter names and documentation for set_locality

The parameter names and documentation could be more descriptive:

Rename nodes to node_addresses for clarity

Expand documentation to explain the implications of hard vs soft locality requirements
-     * @param nodes
+     * @param node_addresses List of node addresses where the data is stored
-     * @param hard Whether the data is only accessible from the given nodes (i.e., the locality is a
-     * hard requirement).
+     * @param hard If true, data can only be accessed from the specified nodes.
+     *             If false, data can be accessed from any node but preferably from the specified nodes.
src/spider/client/Job.hpp (1)

12-14: Address the TODO: Implement error handling with std::expected or Boost.Outcome

The TODO comment suggests enhancing error handling by using std::expected or Boost.Outcome. Implementing this would allow users to retrieve job results in a more robust and less error-prone manner, simplifying the API.

Would you like assistance in implementing this enhancement or opening a GitHub issue to track this task?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 88ed638 and 73eabef.

📒 Files selected for processing (6)

src/spider/client/Data.hpp (1 hunks)
src/spider/client/Driver.hpp (1 hunks)
src/spider/client/Exception.hpp (1 hunks)
src/spider/client/Job.hpp (1 hunks)
src/spider/client/TaskContext.hpp (1 hunks)
src/spider/client/TaskGraph.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

src/spider/client/Driver.hpp
src/spider/client/TaskContext.hpp
src/spider/client/TaskGraph.hpp

🔇 Additional comments (7)

src/spider/client/Exception.hpp (1)

1-10: LGTM! Header guards and includes are properly implemented.

The header guards follow the standard naming convention, and all necessary headers are included for the required functionality.

src/spider/client/Data.hpp (4)

20-25: Update example code to match method signatures.

The example code doesn't match the actual method signatures:

set_cleanup_func expects void(T const&) but example shows filesystem operation
set_locality is shown with a single string but expects vector

32-35: Consider adding error handling to get()

The method should handle potential failures when accessing the underlying storage. Consider using std::expected (C++23) or a Result type.

78-80: Add destructor declaration for pImpl idiom

When using the pImpl idiom with std::unique_ptr, you need to declare the destructor in the header file.

29-80: Template implementation file missing

The template implementation is missing:

Methods are declared but not implemented
Implementation must be available at compile time

src/spider/client/Job.hpp (2)

60-60: Consider adding runtime checks to prevent undefined behaviour

As previously noted, the methods get_result() and get_error() may cause undefined behaviour if called when the job is not in the correct state. Adding runtime checks to throw exceptions in these methods would improve robustness.

Also applies to: 71-71
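
One way such runtime checks could look, as a sketch only. `SketchJob` is a hypothetical class; apart from `JobStatus::Succeeded` and `JobStatus::Failed`, which this thread mentions, the enum values and the `mark_*` helpers are assumptions made so the example is self-contained.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

enum class JobStatus {
    Running,  // Assumed initial state for this sketch.
    Succeeded,
    Failed,
};

template <typename ReturnType>
class SketchJob {
public:
    // Hypothetical helpers that stand in for the scheduler updating the job.
    void mark_succeeded(ReturnType result) {
        m_result = std::move(result);
        m_status = JobStatus::Succeeded;
    }

    void mark_failed(std::string error) {
        m_error = std::move(error);
        m_status = JobStatus::Failed;
    }

    [[nodiscard]] auto get_status() const -> JobStatus { return m_status; }

    /**
     * @return The job's result.
     * @throw std::logic_error if the job is not in the `Succeeded` state.
     */
    [[nodiscard]] auto get_result() const -> ReturnType const& {
        if (JobStatus::Succeeded != m_status) {
            throw std::logic_error("get_result() called on a job that has not succeeded.");
        }
        return *m_result;
    }

    /**
     * @return The job's error message.
     * @throw std::logic_error if the job is not in the `Failed` state.
     */
    [[nodiscard]] auto get_error() const -> std::string const& {
        if (JobStatus::Failed != m_status) {
            throw std::logic_error("get_error() called on a job that has not failed.");
        }
        return m_error;
    }

private:
    JobStatus m_status{JobStatus::Running};
    std::optional<ReturnType> m_result;
    std::string m_error;
};
```

The check turns a silent misuse into a well-defined exception at the cost of one comparison per call.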

36-36: Provide implementations for template methods to prevent linker errors

As previously mentioned, the methods declared in the Job template class lack implementations. This could lead to linker errors. Consider implementing these methods inline or in an included template implementation file.

Also applies to: 43-43, 50-50, 60-60, 71-71

coderabbitai · 2024-11-29T19:20:59Z

src/spider/client/Exception.hpp

+class ConnectionException final : std::exception {
+public:
+    auto what() -> std::string { return fmt::format("Cannot connect to storage {}.", m_addr); }
+
+private:
+    std::string m_addr;
+};


⚠️ Potential issue

Critical issues in ConnectionException implementation.

The class has several issues that need to be addressed:

Inheritance from std::exception should be public

what() method doesn't properly override std::exception::what()

Missing constructor to initialize m_addr

Missing address parameter in error message

Apply this diff to fix the issues:

-class ConnectionException final : std::exception {
+class ConnectionException final : public std::exception {
 public:
-    auto what() -> std::string { return fmt::format("Cannot connect to storage {}.", m_addr); }
+    explicit ConnectionException(std::string addr) : m_addr(std::move(addr)) {}
+
+    [[nodiscard]] const char* what() const noexcept override {
+        m_what = fmt::format("Cannot connect to storage {}.", m_addr);
+        return m_what.c_str();
+    }
 private:
     std::string m_addr;
+    mutable std::string m_what; // Mutable to allow modification in const what()
 };

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class ConnectionException final : std::exception {
public:
    auto what() -> std::string { return fmt::format("Cannot connect to storage {}.", m_addr); }

private:
    std::string m_addr;
};

class ConnectionException final : public std::exception {
public:
    explicit ConnectionException(std::string addr) : m_addr(std::move(addr)) {}

    [[nodiscard]] const char* what() const noexcept override {
        m_what = fmt::format("Cannot connect to storage {}.", m_addr);
        return m_what.c_str();
    }

private:
    std::string m_addr;
    mutable std::string m_what; // Mutable to allow modification in const what()
};
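
An alternative sketch that builds the message once in the constructor, so `what()` only returns a pointer and never formats (and therefore never throws) inside a `noexcept` function. `SketchConnectionException` is a hypothetical name, and plain string concatenation replaces `fmt::format` only to keep the example free of third-party dependencies.

```cpp
#include <exception>
#include <string>

class SketchConnectionException final : public std::exception {
public:
    // The message is formatted here, where throwing (e.g. std::bad_alloc) is still allowed.
    explicit SketchConnectionException(std::string const& addr)
            : m_message("Cannot connect to storage " + addr + ".") {}

    // Matches std::exception::what() exactly: const, noexcept, returns char const*.
    [[nodiscard]] auto what() const noexcept -> char const* override {
        return m_message.c_str();
    }

private:
    std::string m_message;
};
```

Because the inheritance is public, a handler written as `catch (std::exception const& e)` can catch this type; with the original private inheritance, that handler would not match.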

coderabbitai · 2024-11-29T19:20:59Z

src/spider/client/Exception.hpp

+class DriverIdUsedException final : std::exception {
+public:
+    auto what() -> std::string {
+        return fmt::format("Driver id {} already used.", boost::uuids::to_string(m_id));
+    }
+
+private:
+    boost::uuids::uuid m_id;
+};


⚠️ Potential issue

Critical issues in DriverIdUsedException implementation.

The class has similar issues to ConnectionException:

Inheritance from std::exception should be public

what() method doesn't properly override std::exception::what()

Missing constructor to initialize m_id

Apply this diff to fix the issues:

-class DriverIdUsedException final : std::exception {
+class DriverIdUsedException final : public std::exception {
 public:
-    auto what() -> std::string {
-        return fmt::format("Driver id {} already used.", boost::uuids::to_string(m_id));
-    }
+    explicit DriverIdUsedException(boost::uuids::uuid id) : m_id(id) {}
+
+    [[nodiscard]] const char* what() const noexcept override {
+        m_what = fmt::format("Driver id {} already used.", boost::uuids::to_string(m_id));
+        return m_what.c_str();
+    }
 private:
     boost::uuids::uuid m_id;
+    mutable std::string m_what; // Mutable to allow modification in const what()
 };

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class DriverIdUsedException final : std::exception {
public:
    auto what() -> std::string {
        return fmt::format("Driver id {} already used.", boost::uuids::to_string(m_id));
    }

private:
    boost::uuids::uuid m_id;
};

class DriverIdUsedException final : public std::exception {
public:
    explicit DriverIdUsedException(boost::uuids::uuid id) : m_id(id) {}

    [[nodiscard]] const char* what() const noexcept override {
        m_what = fmt::format("Driver id {} already used.", boost::uuids::to_string(m_id));
        return m_what.c_str();
    }

private:
    boost::uuids::uuid m_id;
    mutable std::string m_what; // Mutable to allow modification in const what()
};

coderabbitai · 2024-11-29T19:20:59Z

src/spider/client/Job.hpp

+
+    /**
+     * NOTE: It is undefined behavior to call this method for a job that is not in the `Succeed`
+     * state.


⚠️ Potential issue

Correct the job state names in the documentation

There are inconsistencies in the documentation comments regarding job state names:

In line 54, 'Succeed' should be 'Succeeded' to match the enum JobStatus::Succeeded.

In line 63, 'Fail' should be 'Failed' to match the enum JobStatus::Failed.

Apply the following changes to correct the documentation:

- * NOTE: It is undefined behaviour to call this method for a job that is not in the `Succeed` state.
+ * NOTE: It is undefined behaviour to call this method for a job that is not in the `Succeeded` state.

- * NOTE: It is undefined behaviour to call this method for a job that is not in the `Fail` state.
+ * NOTE: It is undefined behaviour to call this method for a job that is not in the `Failed` state.

Also applies to: 63-63

kirkrodrigues · 2024-11-29T21:31:03Z

docs/quick_start.md

You still need to remove this, right?

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)

src/spider/client/Exception.hpp (2)
14-15: Consider using std::string_view for better performance.

The constructor could be more efficient by accepting a std::string_view instead of const std::string&, especially since the parameter is only used for formatting.
-    explicit ConnectionException(std::string const& addr)
+    explicit ConnectionException(std::string_view addr)
26-28: Improve message formatting consistency.

The message formatting spans multiple lines with inconsistent indentation. Consider reformatting for better readability:
-            : m_message(
-                      fmt::format("Driver ID {} is currently in use.", boost::uuids::to_string(id))
-              ) {}
+            : m_message(fmt::format("Driver ID {} is currently in use.",
+                                   boost::uuids::to_string(id))) {}

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 61be939 and 5bffeee.

📒 Files selected for processing (6)

src/spider/client/Data.hpp (1 hunks)
src/spider/client/Driver.hpp (1 hunks)
src/spider/client/Exception.hpp (1 hunks)
src/spider/client/Job.hpp (1 hunks)
src/spider/client/TaskContext.hpp (1 hunks)
src/spider/client/TaskGraph.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

src/spider/client/Data.hpp
src/spider/client/Driver.hpp
src/spider/client/Job.hpp
src/spider/client/TaskContext.hpp
src/spider/client/TaskGraph.hpp

🔇 Additional comments (2)

src/spider/client/Exception.hpp (2)

1-10: LGTM! Header guards and includes are well-organized.

The includes are properly organized with standard library headers first, followed by third-party libraries. All necessary dependencies are included.

11-35: Well-designed exception hierarchy!

The implementation follows good C++ practices with:

Proper inheritance from std::exception
Consistent design patterns across exception classes
Thread-safe message handling
Clear and informative error messages

sitaowang1998 added 2 commits October 31, 2024 20:36

feat: Add client code structure and interface for Data, Future and exception

6ffea6e

feat: Split client into two libraries and add interface

94af15b

kirkrodrigues requested changes Oct 31, 2024

sitaowang1998 added 6 commits November 1, 2024 09:14

fix: Add boost library for spider_client_lib

f69523a

style: Improve code style for data based on pr comments

ccf6cc8

fix: Add absl as public library for core

5e26f58

style: Improve code style for client interface based on pr reviea principlew

020093c

fix: Try fix clang-tidy find nout found

ee222f0

docs: Add quick start doc

1b0ccac

kirkrodrigues requested changes Nov 2, 2024

sitaowang1998 added 2 commits November 3, 2024 13:45

style: Change markdown headings to sentence style and hard wrap markdown at 100 chars

b3a2e1e

docs: Update doc according to pr comments

ec8f500

kirkrodrigues reviewed Nov 3, 2024

sitaowang1998 added 2 commits November 3, 2024 23:18

docs: Remove the worker note section and put the content in run task section

5dc12cb

sitaowang1998 requested a review from kirkrodrigues November 6, 2024 01:28

sitaowang1998 and others added 3 commits November 6, 2024 19:50

Merge branch 'main' into interface

4cd6233

Change future to job

fec5e73

Change task to context

a5e799b

sitaowang1998 and others added 3 commits November 16, 2024 14:52

Remove TaskGraph::run to simplify interface

1b15b5d

Add separate key-value store interface

98104f1

Edit some docstrings.

cb369fd

kirkrodrigues requested changes Nov 19, 2024

sitaowang1998 added 5 commits November 19, 2024 23:18

Fix include guard

b4a6f36

Merge branch 'main' into interface

f3de2ca

Add serialzable concept

70547ae

Merge remote-tracking branch 'origin/interface' into interface

525311c

Fix clang-tidy

c776376

sitaowang1998 added 5 commits November 29, 2024 09:14

Fix typo

165eb84

Fix clean up function signature

06de774

Fix set_locality argument in docstring example

c7a07b1

Add void return type for kv_store_insert

f8c623a

Add noreturn and void return type for TaskContext::abort

488eaa3

coderabbitai bot reviewed Nov 29, 2024

kirkrodrigues added 2 commits November 29, 2024 11:48

Fix some header guards.

f0729d9

Edit some docstrings and comments.

4d2aa6c

coderabbitai bot reviewed Nov 29, 2024

kirkrodrigues reviewed Nov 29, 2024

Fix typo in Data docstring example.

88ed638

coderabbitai bot reviewed Nov 29, 2024

sitaowang1998 added 3 commits November 29, 2024 19:03

Add exception in docstring

bd55552

Remove pImpl in interface

9897995

Fix clang-tidy

73eabef

coderabbitai bot reviewed Nov 29, 2024

sitaowang1998 added 2 commits November 29, 2024 19:32

Fix exception what

85d2475

Fix docstring job state name

61be939

sitaowang1998 requested a review from kirkrodrigues November 29, 2024 20:30

Refactor exceptions.

5bffeee

kirkrodrigues requested changes Nov 29, 2024

coderabbitai bot reviewed Nov 29, 2024

Remove quick start guide

22f370d

sitaowang1998 requested a review from kirkrodrigues November 29, 2024 22:44

kirkrodrigues changed the title ~~feat: Add client interface~~ feat: Add client interface. Nov 30, 2024

kirkrodrigues approved these changes Nov 30, 2024

sitaowang1998 merged commit 326e2f9 into y-scope:main Nov 30, 2024
4 checks passed

sitaowang1998 deleted the interface branch November 30, 2024 03:55

Spider lets users pass the metadata of external data around in `spider::Data` objects. `Data` stores the metadata of the external data and gives Spider the information it needs for correct and efficient scheduling and failure recovery. `Data` stores a list of nodes that have locality for the external data, and the user can specify whether locality is a hard requirement, i.e., whether tasks can only run on the nodes in the locality list. `Data` can include a `cleanup` function, which runs when the `Data` object is no longer referenced by any task or client. `Data` also has a persist flag indicating that the external data is persisted and does not need to be cleaned up.
